## ─ Attaching packages ──────────────────── tidyverse 1.3.1 ─
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.5 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 2.0.1 ✓ forcats 0.5.1
## ─ Conflicts ───────────────────── tidyverse_conflicts() ─
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## Linking to GEOS 3.8.1, GDAL 3.2.1, PROJ 7.2.1
## To enable caching of data, set `options(tigris_use_cache = TRUE)` in your R script or .Rprofile.
##
## Attaching package: 'censusapi'
## The following object is masked from 'package:methods':
##
## getFunction
The plot is derived from the annual mean concentration of PM2.5 (µg/m³), a weighted average of measured monitor concentrations and satellite observations, averaged over three years (2015 to 2017). Data sources: Air Monitoring Network and Satellite Remote Sensing Data, California Air Resources Board (CARB). Overall, PM2.5 concentrations are fairly uniform across the region, with Vallejo and Oakland slightly higher; this is also visible in the figure, where there is little color difference between areas. Mobile emissions from motor vehicles, ships, planes, and trains are the largest source of PM2.5 air pollution. Wildfires are another important source of PM2.5 in California, since smoke particles fall almost entirely within the PM2.5 size range.
The plot is derived from the spatially modeled, age-adjusted rate of emergency department (ED) visits for asthma per 10,000, averaged over 2015-2017. Rates are modeled for California ZIP codes and age-adjusted using Tracking California; the source data are the Emergency Department and Patient Discharge Datasets from the State of California, Office of Statewide Health Planning and Development (OSHPD). Asthma increases an individual's sensitivity to pollutants. A study found an increase in asthma diagnoses following increases in ambient air pollution and exposure to certain pesticides. Vallejo, Antioch, and Oakland have the highest asthma rates, more than triple the California value (52.14). This may be due to the high density of freeways throughout the county, pesticide use on agricultural crops (such as recurring aerial spraying), and secondhand smoke coupled with poor housing conditions.
## `geom_smooth()` using formula 'y ~ x'
When PM2.5 values are low, the fitted line describes the data reasonably well. At higher PM2.5 values, however, the observed asthma rates deviate widely from the fitted values: many points lie far from the line and correlate poorly with PM2.5, which suggests that factors other than PM2.5 also contribute to asthma.
##
## Call:
## lm(formula = Asthma ~ PM2.5, data = combined)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -54.47 -25.89  -9.61  12.94 182.95
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -116.278     13.040  -8.917   <2e-16 ***
## PM2.5         19.862      1.534  12.950   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 37.49 on 1578 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.09606, Adjusted R-squared: 0.09549
## F-statistic: 167.7 on 1 and 1578 DF, p-value: < 2.2e-16
A 1 µg/m³ increase in PM2.5 is associated with an increase of 19.862 in Asthma (ED visits per 10,000); PM2.5 explains 9.6% of the variation in Asthma.
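Both quantities quoted here can be read directly from a fitted model object. The following is a minimal sketch, assuming a data frame shaped like `combined` with columns `PM2.5` and `Asthma`. The data frame `demo` is simulated for illustration only and is not the study data, so its numbers will differ from the summary above.

```r
# Simulated stand-in for `combined` (assumption: the real data are not reproduced here)
set.seed(1)
demo <- data.frame(PM2.5 = runif(200, 6, 11))
demo$Asthma <- -116 + 20 * demo$PM2.5 + rnorm(200, sd = 37)

fit <- lm(Asthma ~ PM2.5, data = demo)

# Slope: expected change in Asthma for a one-unit increase in PM2.5
slope <- unname(coef(fit)["PM2.5"])

# Multiple R-squared: share of the variance in Asthma explained by PM2.5
r2 <- summary(fit)$r.squared

c(slope = slope, r.squared = r2)
```

For a regression with a single predictor, the R-squared equals the squared correlation between the two variables, which gives a quick sanity check on the reported value.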
As the plot shows, the residuals are centered below zero (median -9.61) and their density is strongly asymmetric: the peak lies to the left of 0 and a long tail extends to the right (maximum 182.95), whereas the residuals should follow a symmetric bell curve centered at 0.
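The shape of a residual distribution can also be checked numerically rather than only by eye. The sketch below uses simulated data with multiplicative noise (hypothetical, not the study data). The helper `skewness` is defined in the snippet itself and is not a base R function.

```r
set.seed(2)
x <- runif(300, 6, 11)
# Multiplicative (log-normal) noise gives the response a long upper tail
y <- exp(0.7 + 0.35 * x + rnorm(300, sd = 0.65))

fit <- lm(y ~ x)
res <- residuals(fit)

# Moment-based sample skewness: 0 for a symmetric distribution,
# positive when the long tail is on the right
skewness <- function(v) mean((v - mean(v))^3) / mean((v - mean(v))^2)^1.5

res_mean   <- mean(res)    # numerically zero for least squares with an intercept
res_median <- median(res)  # differs from the mean when the residuals are asymmetric
res_skew   <- skewness(res)

# Visual check, only drawn in an interactive session
if (interactive()) plot(density(res), main = "Residual density")
```

Because least squares with an intercept forces the residual mean to zero, a gap between the mean and the median is the more informative sign of asymmetry.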
## `geom_smooth()` using formula 'y ~ x'
##
## Call:
## lm(formula = logAsthma ~ PM2.5, data = combined)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.00402 -0.46479  0.03313  0.42298  1.75525
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.69234    0.22840   3.031  0.00248 **
## PM2.5        0.35633    0.02686  13.264  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6566 on 1578 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.1003, Adjusted R-squared: 0.09974
## F-statistic: 175.9 on 1 and 1578 DF, p-value: < 2.2e-16
A one-unit increase in PM2.5 is associated with an increase of 0.356 in log(Asthma); PM2.5 explains 10% of the variation in log(Asthma).
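Because the response is on a log scale, the slope can also be expressed as a relative change in Asthma itself. The short calculation below assumes that `logAsthma` is the natural logarithm of Asthma (R's default for `log()`); the source does not show how the column was created.

```r
b1 <- 0.35633              # PM2.5 slope taken from the log-model summary above

ratio <- exp(b1)           # multiplicative change in Asthma per one-unit PM2.5
pct   <- 100 * (ratio - 1) # the same change expressed as a percentage

round(pct, 1)
```

Under that assumption, a one-unit increase in PM2.5 corresponds to an increase of roughly 43% in the asthma rate.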
After log-transforming Asthma, the residual density is much closer to a normal distribution, roughly symmetric and centered near 0 (median 0.033), although its shape still deviates somewhat from an ideal bell curve.
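The effect of the transformation on residual normality can be compared across the two models. The sketch below uses simulated data (not the study data) and the Shapiro-Wilk statistic W from `shapiro.test()`, where values closer to 1 indicate residuals closer to normal. The column names mirror those in the model calls above; how `logAsthma` was actually built is an assumption.

```r
set.seed(3)
demo <- data.frame(PM2.5 = runif(300, 6, 11))
demo$Asthma <- exp(0.7 + 0.35 * demo$PM2.5 + rnorm(300, sd = 0.65))
demo$logAsthma <- log(demo$Asthma)   # natural log (assumed)

fit_raw <- lm(Asthma ~ PM2.5, data = demo)      # untransformed response
fit_log <- lm(logAsthma ~ PM2.5, data = demo)   # log-transformed response

# Shapiro-Wilk W statistic for each model's residuals
w_raw <- unname(shapiro.test(residuals(fit_raw))$statistic)
w_log <- unname(shapiro.test(residuals(fit_log))$statistic)

c(raw = w_raw, log = w_log)
```

A Q-Q plot of each model's residuals (for example with `qqnorm()` and `qqline()`) would show the same comparison graphically.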